CSMET: Comparative Genomic Motif Detection via Multi-Resolution Phylogenetic Shadowing

Authors

  • Pradipta Ray
  • Suyash Shringarpure
  • Mladen Kolar
  • Eric P. Xing
Abstract

Functional turnover of transcription factor binding sites (TFBSs), such as whole-motif loss or gain, is a common event during genome evolution. Conventional probabilistic phylogenetic shadowing methods model the evolution of genomes only at the nucleotide level, and lack the ability to capture the evolutionary dynamics of functional turnover of aligned sequence entities. As a result, comparative genomic search for non-conserved motifs across evolutionarily related taxa remains a difficult challenge, especially in higher eukaryotes, where the cis-regulatory regions containing motifs can be long and divergent. Existing methods rely heavily on specialized pattern-driven heuristic search or sampling algorithms, which can be difficult to generalize and hard to interpret based on phylogenetic principles. We propose a new method, Conditional Shadowing via Multi-resolution Evolutionary Trees (CSMET), which uses a context-dependent probabilistic graphical model that allows aligned sites from different taxa in a multiple alignment to be modeled by either a background phylogeny or an appropriate motif phylogeny, conditioned on the functional specifications of each taxon. The functional specifications themselves are the output of a phylogeny that models the evolution not of individual nucleotides, but of the overall functionality (e.g., functional retention or loss) of the aligned sequence segments over lineages. By combining this method with a hidden Markov model that autocorrelates evolutionary rates on successive sites in the genome, CSMET offers a principled way to take lineage-specific evolution of TFBSs into consideration during motif detection, and a readily computable analytical form of the posterior distribution of motifs under TFBS turnover. On both simulated and real Drosophila cis-regulatory modules, CSMET outperforms other state-of-the-art comparative genomic motif finders.
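
The idea of conditioning each taxon's nucleotides on a per-taxon functional state can be illustrated with a toy calculation. The sketch below is not the CSMET model: it assumes a star phylogeny, a Jukes-Cantor substitution model, a symmetric two-state gain/loss process for functionality, a single alignment column, and brute-force enumeration of functional states, and it omits the hidden Markov model over sites. All function names, rates, and branch lengths are hypothetical choices made for illustration.

```python
import itertools
import math

BASES = "ACGT"


def jc69_prob(a, b, dist):
    """Jukes-Cantor probability that base a is observed as base b after distance dist."""
    decay = math.exp(-4.0 * dist / 3.0)
    return 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay


def star_likelihood(bases, branch_lengths, rate):
    """Likelihood of the leaf bases on a star tree, summing over the unknown root base."""
    if not bases:
        return 1.0  # no taxa assigned to this model
    total = 0.0
    for root in BASES:
        p = 0.25  # uniform root distribution (assumption)
        for base, t in zip(bases, branch_lengths):
            p *= jc69_prob(root, base, rate * t)
        total += p
    return total


def functional_prior(z, branch_lengths, root_functional=0.5, turnover=0.5):
    """Prior of functional states z (1 = functional) under a symmetric two-state process."""
    total = 0.0
    for root in (0, 1):
        p = root_functional if root == 1 else 1.0 - root_functional
        for state, t in zip(z, branch_lengths):
            stay = 0.5 + 0.5 * math.exp(-2.0 * turnover * t)
            p *= stay if state == root else 1.0 - stay
        total += p
    return total


def functional_posterior(column, branch_lengths, motif_rate=0.2, background_rate=1.0):
    """Posterior over functional-state vectors z for one aligned column.

    Functional taxa are scored jointly under a slowly evolving "motif" tree,
    the remaining taxa jointly under a faster "background" tree.
    """
    joint = {}
    for z in itertools.product((0, 1), repeat=len(column)):
        motif = [(b, t) for b, t, s in zip(column, branch_lengths, z) if s == 1]
        backg = [(b, t) for b, t, s in zip(column, branch_lengths, z) if s == 0]
        likelihood = star_likelihood(
            [b for b, _ in motif], [t for _, t in motif], motif_rate
        ) * star_likelihood(
            [b for b, _ in backg], [t for _, t in backg], background_rate
        )
        joint[z] = functional_prior(z, branch_lengths) * likelihood
    norm = sum(joint.values())
    return {z: p / norm for z, p in joint.items()}


if __name__ == "__main__":
    branches = [0.3, 0.3, 0.3, 0.3]  # hypothetical branch lengths for four taxa
    for column in ("AAAA", "ACGT"):
        post = functional_posterior(column, branches)
        best = max(post, key=post.get)
        print(column, "most probable functional states:", best)
```

In this toy setting a fully conserved column favors the all-functional assignment over the all-background one, while a fully divergent column favors the reverse. Enumerating all functional-state vectors is only feasible for a handful of taxa; the abstract states that CSMET instead yields a readily computable analytical form of the posterior.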

Similar resources

A Model of the Statistical Power of Comparative Genome Sequence Analysis

Comparative genome sequence analysis is powerful, but sequencing genomes is expensive. It is desirable to be able to predict how many genomes are needed for comparative genomics, and at what evolutionary distances. Here I describe a simple mathematical model for the common problem of identifying conserved sequences. The model leads to some useful rules of thumb. For a given evolutionary distanc...

Effective species count and motif efficiency: The value of comparative genomics in characterizing conserved sequence positions

Background: The identification and characterization of functional, non-coding DNA sequence elements is key to the understanding of cell function, differentiation, and pathology because the elements affect when and to what extent nearby genes are expressed. The proliferation of completed genomic sequences during the past few years has provided impetus for numerous comparative-genomics efforts to...

Comparative analysis of the MIR319a microRNA locus in Arabidopsis and related Brassicaceae.

MicroRNAs (miRNAs) are important regulators of gene expression in multicellular organisms. Yet, little is known about their molecular evolution. The 20- to 22-nt long miRNAs are processed in plants from foldbacks that are a few hundred base pairs in size. Often, these foldbacks are embedded in much larger precursor transcripts. To investigate functional constraints on sequence evolution of miRN...

Haplotype structure and phylogenetic shadowing of a hypervariable region in the CAPN10 gene

Hum Genet. 117:258-66. Abstract: It has been proposed that variation in calpain 10 (CAPN10) contributes to the risk of ...

Journal:
  • PLoS Computational Biology

Volume 4  Issue 

Pages  -

Publication date: 2008